Collective excitations and instabilities in multi-layer stacks of dipolar condensates
We analyze theoretically the collective mode dispersion in multi-layer stacks
of two dimensional dipolar condensates and find a strong enhancement of the
roton instability. We discuss the interplay between the dynamical instability
and roton softening for moving condensates. We use our results to analyze the
decoherence rate of Bloch oscillations for systems in which the s-wave
scattering length is tuned close to zero using Feshbach resonance. Our results
are in qualitative agreement with recent experiments of Fattori et al. on
K atoms. Comment: 5 pages with 3 figure
PreCog: Improving Crowdsourced Data Quality Before Acquisition
Quality control in crowdsourcing systems is crucial. It is typically done
after data collection, often using additional crowdsourced tasks to assess and
improve the quality. These post-hoc methods can easily add cost and latency to
the acquisition process--particularly if collecting high-quality data is
important. In this paper, we argue for pre-hoc interface optimizations based on
feedback that helps workers improve data quality before it is submitted and is
well suited to complement post-hoc techniques. We propose the Precog system
that explicitly supports such interface optimizations for common integrity
constraints as well as more ambiguous text acquisition tasks where quality is
ill-defined. We then develop the Segment-Predict-Explain pattern for detecting
low-quality text segments and generating prescriptive explanations to help the
worker improve their text input. Our unique combination of segmentation and
prescriptive explanation is necessary for Precog to collect 2x more
high-quality text data than non-Precog approaches on two real domains
QFix: Diagnosing errors through query histories
Data-driven applications rely on the correctness of their data to function
properly and effectively. Errors in data can be incredibly costly and
disruptive, leading to loss of revenue, incorrect conclusions, and misguided
policy decisions. While data cleaning tools can purge datasets of many errors
before the data is used, applications and users interacting with the data can
introduce new errors. Subsequent valid updates can obscure these errors and
propagate them through the dataset, causing more discrepancies. Even when some
of these discrepancies are discovered, they are often corrected superficially,
on a case-by-case basis, further obscuring the true underlying cause, and
making detection of the remaining errors harder. In this paper, we propose
QFix, a framework that derives explanations and repairs for discrepancies in
relational data, by analyzing the effect of queries that operated on the data
and identifying potential mistakes in those queries. QFix is flexible, handling
scenarios where only a subset of the true discrepancies is known, and robust to
different types of update workloads. We make four important contributions: (a)
we formalize the problem of diagnosing the causes of data errors based on the
queries that operated on and introduced errors to a dataset; (b) we develop
exact methods for deriving diagnoses and fixes for identified errors using
state-of-the-art tools; (c) we present several optimization techniques that
improve our basic approach without compromising accuracy, and (d) we leverage a
tradeoff between accuracy and performance to scale diagnosis to large datasets
and query logs, while achieving near-optimal results. We demonstrate the
effectiveness of QFix through extensive evaluation over benchmark and synthetic
data
Supervised multiview learning based on simultaneous learning of multiview intact and single view classifier
The multiview learning problem refers to the problem of learning a classifier
from multiple-view data. In such a data set, each data point is represented by
multiple different views. In this paper, we propose a novel method for this
problem. This method is based on two assumptions. The first assumption is that
each data point has an intact feature vector, and each view is obtained by a
linear transformation from the intact vector. The second assumption is that the
intact vectors are discriminative, and in the intact space, we have a linear
classifier to separate the positive class from the negative class. We define an
intact vector for each data point, and a view-conditional transformation matrix
for each view, and propose to reconstruct the multiple view feature vectors by
the product of the corresponding intact vectors and transformation matrices.
Moreover, we also propose a linear classifier in the intact space, and learn it
jointly with the intact vectors. The learning problem is modeled by a
minimization problem, and the objective function is composed of a Cauchy error
estimator-based view-conditional reconstruction term over all data points and
views, and a classification error term measured by hinge loss over all the
intact vectors of all the data points. Some regularization terms are also
imposed on different variables in the objective function. The minimization
problem is solved by an iterative algorithm using an alternating optimization
strategy and gradient descent. The proposed algorithm shows its
advantage in comparison to other multiview learning algorithms on
benchmark data sets
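The objective described in this abstract can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the Cauchy estimator is taken to be log(1 + ||r||^2 / c^2), the regularizers are taken to be squared norms, and the names `objective`, `alpha`, `beta`, and `c` are hypothetical.

```python
import numpy as np

def cauchy_loss(residual, c=1.0):
    # Assumed form of the Cauchy error estimator: log(1 + ||r||^2 / c^2).
    return np.log1p(np.sum(residual ** 2) / c ** 2)

def objective(views, labels, Y, W, w, b=0.0, alpha=1.0, beta=0.1, c=1.0):
    """Evaluate a joint reconstruction + classification objective.

    views  : list of arrays, one per view, each of shape (n, d_v)
    labels : array of shape (n,) with entries in {-1, +1}
    Y      : array of shape (n, d), one intact vector per data point
    W      : list of arrays, one per view, each of shape (d_v, d)
    w, b   : linear classifier in the intact space
    alpha, beta, c : hypothetical trade-off and scale parameters
    """
    # View-conditional reconstruction: each view is W_v applied to the intact vector.
    recon = 0.0
    for X_v, W_v in zip(views, W):
        residuals = X_v - Y @ W_v.T
        recon += sum(cauchy_loss(r, c) for r in residuals)
    # Hinge loss of the linear classifier over all intact vectors.
    margins = labels * (Y @ w + b)
    hinge = np.sum(np.maximum(0.0, 1.0 - margins))
    # Squared-norm regularization on all variables (an assumption).
    reg = np.sum(Y ** 2) + sum(np.sum(W_v ** 2) for W_v in W) + np.sum(w ** 2)
    return recon + alpha * hinge + beta * reg
```

An alternating scheme of the kind the abstract mentions would update `Y`, each `W_v`, and `w` in turn by gradient steps on this value while holding the others fixed.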
Quantum fluids of self-assembled chains of polar molecules
We study polar molecules in a stack of strongly confined pancake traps. When
dipolar moments point perpendicular to the planes of the traps and are
sufficiently strong, the system is stable against collapse but attractive
interaction between molecules in different layers leads to the formation of
extended chains of molecules, analogous to the chaining phenomenon in
classical rheological electro- and magnetofluids. We analyze properties of the
resulting quantum liquid of dipolar chains and show that only the longest
chains undergo Bose-Einstein condensation with a strongly reduced condensation
temperature. We discuss several experimental methods for studying chains of
dipolar molecules. Comment: 4 pages and 3 figures, final version as publishe
Disordered Bose-Einstein Condensates in Quasi One-Dimensional Magnetic Microtraps
We analyze effects of a random magnetic potential in a microfabricated
waveguide for ultra-cold atoms. We find that the shape and position
fluctuations of a current-carrying wire induce a strongly disordered potential
that is quasiperiodic with a lengthscale set by the atom-wire separation. The
theory is used to explain quantitatively the experimentally observed
fragmentation of the quasi one-dimensional Bose-Einstein condensates.
Furthermore, we show that nonlinear dynamics can be used to provide important
insights into the nature of the strongly fragmented condensates. We argue that
a quantum phase transition from the superfluid to the insulating Bose glass
phase may be reached and detected under realistic experimental conditions. Comment: Revised version. This paper has been selected for the March 1, 2004
issue of Virtual Journal of Nanoscale Science & Technology
(http://www.vjnano.org
PRESTO: Probabilistic Cardinality Estimation for RDF Queries Based on Subgraph Overlapping
In query optimisation, accurate cardinality estimation is essential for
finding optimal query plans. It is especially challenging for RDF due to the
lack of explicit schema and the excessive occurrence of joins in RDF queries.
Existing approaches typically collect statistics based on the counts of triples
and estimate the cardinality of a query as the product of its join components,
where errors can accumulate even when the estimation of each component is
accurate. As opposed to existing methods, we propose PRESTO, a cardinality
estimation method that is based on the counts of subgraphs instead of triples
and uses a probabilistic method to estimate cardinalities of RDF queries as a
whole. PRESTO avoids some major issues of existing approaches and is able to
accurately estimate arbitrary queries under a bounded memory constraint. We
evaluate PRESTO with YAGO and show that PRESTO is more accurate for both simple
and complex queries
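The error accumulation attributed here to existing approaches can be illustrated with a toy example. The sketch below is not PRESTO and not any specific system's estimator: it uses a textbook uniformity-based join estimate, and the data and function names are hypothetical. Each triple pattern is reduced to a list of (subject, object) pairs, and the join matches the object of the left pattern with the subject of the right one.

```python
from collections import Counter

def true_join_cardinality(left, right):
    # Count the actual join results: for each left pair, the number of
    # right pairs whose subject equals the left pair's object.
    right_subjects = Counter(s for s, _ in right)
    return sum(right_subjects[o] for _, o in left)

def uniform_join_estimate(left, right):
    # Textbook estimate: |L| * |R| / max(distinct join values on either side).
    # The per-pattern counts |L| and |R| are exact; only the combination
    # step assumes join values are uniformly distributed.
    distinct = max(len({o for _, o in left}), len({s for s, _ in right}))
    return len(left) * len(right) / distinct

# Skewed toy data: one "hub" value carries half of each pattern's pairs.
left = [(f"s{i}", "hub") for i in range(4)] + [(f"t{i}", f"r{i}") for i in range(4)]
right = [("hub", f"u{i}") for i in range(4)] + [(f"q{i}", f"v{i}") for i in range(4)]

actual = true_join_cardinality(left, right)
estimate = uniform_join_estimate(left, right)
```

Here the true cardinality is 16 while the estimate is 12.8, even though both component counts are exact. Chaining further joins on top of such an estimate compounds the discrepancy.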
TritanDB: Time-series Rapid Internet of Things Analytics
The efficient management of data is an important prerequisite for realising
the potential of the Internet of Things (IoT). Given the large volume of
structured time-series IoT data, two issues are addressing the difficulties of
data integration between heterogeneous Things and improving ingestion and query
performance across databases on both resource-constrained Things and in the
cloud. In this paper, we examine the structure of public IoT data and discover
that the majority exhibit unique flat, wide and numerical characteristics with
a mix of evenly and unevenly-spaced time-series. We investigate the advances in
time-series databases for telemetry data and combine these findings with
microbenchmarks to determine the best compression techniques and storage data
structures to inform the design of a novel solution optimised for IoT data. A
query translation method with low overhead even on resource-constrained Things
allows us to utilise rich data models like the Resource Description Framework
(RDF) for interoperability and data integration on top of the optimised
storage. Our solution, TritanDB, shows an order of magnitude performance
improvement across both Things and cloud hardware on many state-of-the-art
databases within IoT scenarios. Finally, we describe how TritanDB supports
various analyses of IoT time-series data like forecasting
How fear of future outcomes affects social dynamics
Mutualistic relationships among different species are ubiquitous in
nature. To prevent mutualism from slipping into antagonism, a host often
invokes a "carrot and stick" approach towards symbionts with a stabilizing
effect on their symbiosis. In open human societies, a mutualistic relationship
arises when a native insider population attracts outsiders with benevolent
incentives in hope that the additional labor will improve the standard of all.
A lingering question, however, is the extent to which insiders are willing to
tolerate outsiders before mutualism slips into antagonism. To test the
assertion by Karl Popper that unlimited tolerance leads to the demise of
tolerance, we model a society under a growing incursion from the outside.
Guided by their traditions of maintaining the social fabric and prizing
tolerance, the insiders reduce their benevolence toward the growing
subpopulation of outsiders but do not invoke punishment. This reduction of
benevolence intensifies as less tolerant insiders (e.g., "radicals") openly
renounce benevolence. Although more tolerant insiders maintain some level of
benevolence, they may also tacitly support radicals out of fear for the future.
If radicals and their tacit supporters achieve a critical majority, herd
behavior ensues and the relation between the insider and outsider
subpopulations turns antagonistic. To control the risk of unwanted social
dynamics, we map the parameter space within which the tolerance of insiders is
in balance with the assimilation of outsiders, the tolerant insiders maintain a
sustainable majority, and any reduction in benevolence occurs smoothly. We also
identify the circumstances that cause the relations between insiders and
outsiders to collapse or that lead to the dominance of the outsiders. Comment: 10+5 pages, 5+3 figures, Supporting Information include
CLAMShell: Speeding up Crowds for Low-latency Data Labeling
Data labeling is a necessary but often slow process that impedes the
development of interactive systems for modern data analysis. Despite rising
demand for manual data labeling, there is a surprising lack of work addressing
its high and unpredictable latency. In this paper, we introduce CLAMShell, a
system that speeds up crowds in order to achieve consistently low-latency data
labeling. We offer a taxonomy of the sources of labeling latency and study
several large crowd-sourced labeling deployments to understand their empirical
latency profiles. Driven by these insights, we comprehensively tackle each
source of latency, both by developing novel techniques such as straggler
mitigation and pool maintenance and by optimizing existing methods such as
crowd retainer pools and active learning. We evaluate CLAMShell in simulation
and on live workers on Amazon's Mechanical Turk, demonstrating that our
techniques can provide an order of magnitude speedup and variance reduction
over existing crowdsourced labeling strategies